# Continued Pretraining
## Gemma 2 Llama Swallow 27b It V0.1

Developer: tokyotech-llm · Large Language Model · Transformers · Multilingual

A Japanese-enhanced large language model based on the Gemma-2 architecture that significantly improves Japanese capabilities while retaining the base model's English proficiency.

## Taiwan Tinyllama V1.0 Chat

Developer: DavidLanz · License: Apache-2.0 · Large Language Model · Transformers · Chinese

A model built on the TinyLlama-1.1B architecture and adapted to Traditional Chinese through continued pretraining on approximately 2 billion tokens.

## Llama 3 Youko 8b

Developer: rinna · Large Language Model · Transformers · Multilingual

A Japanese-optimized model based on Meta-Llama-3-8B, obtained through continued pretraining on a mixed Japanese and English dataset of 22 billion tokens.

## Saul 7B Base

Developer: Equall · License: MIT · Large Language Model · Transformers · English

A large language model tailored to the legal domain, obtained through continued pretraining of Mistral-7B.

## Swallow MS 7b V0.1

Developer: tokyotech-llm · License: Apache-2.0 · Large Language Model · Transformers · Multilingual

Swallow-MS-7b-v0.1 is a Japanese-enhanced model obtained through continued pretraining of Mistral-7B-v0.1, developed by tokyotech-llm, with strong performance on Japanese tasks.

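All of the models above are distributed as standard Transformers checkpoints, so they can be loaded in the same way. Below is a minimal sketch assuming the Hugging Face repository id `tokyotech-llm/Swallow-MS-7b-v0.1`; the repository id and prompt are illustrative assumptions, so substitute the model you actually want to try.

```python
# Minimal sketch: loading one of the listed continued-pretraining checkpoints
# with the Hugging Face Transformers library. The repository id below is an
# assumption based on the listing above; swap in any of the other models.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tokyotech-llm/Swallow-MS-7b-v0.1"  # assumed repo id from the listing

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Base (non-instruct) checkpoints are plain language models, so prompt them
# with text to continue rather than chat-formatted messages.
prompt = "東京工業大学の主なキャンパスは、"  # illustrative Japanese prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```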